Abstract

The Mat-Sea-Cal Oral Proficiency Tests are a series of
comparable grammatical structure tests. They have been developed
in six languages: English, Spanish, Cantonese, Mandarin, Ilokano
and Tagalog. Their purpose is to identify linguistic skills and
deficiencies of primary school children, grades K through 4. This
research reported on the psychometric qualities of the English and
Spanish editions.

Reliability was computed by the method of internal equivalence.
Coefficients were .91 on the English test and .94 on the Spanish
test. Point biserial coefficients were calculated as the discrimination
index. Results varied by subtest (Listening Comprehension,
Sentence Repetition, and Structured Response). Factor analysis, via
principal factoring with varimax rotation, was employed to identify
item pools.

Results indicated that approximately 30 percent of all original
items require revision. (These tests are labeled "Field Test Edition"
by the authors.) The remaining items possessed good to excellent
discrimination indices, and difficulty levels appropriate for
criterion referenced measures.

LANGUAGE ASSESSMENT AND THE MAT-SEA-CAL TESTS
Language assessment might be defined as the systematic attempt to ascertain one's
ability to effectively receive and produce the elements of verbal expression.
Linguistic appraisal measures the ability to synthesize proficiencies in phonology,
morphology, syntax and lexicon into a meaningful Gestalt.

A major objective of primary education is facilitating the acquisition of this integrated
linguistic configuration. Indeed, fostering a unified constellation of communication
proficiencies unlocks educational opportunities for the student. Most educational
experiences require that by approximately fourth grade children are basically
functional in communicative arts, of which speech is pivotal. The adequacy of
this assumption is challenged, yearly it seems, by U.S.O.E. statistics relative
to the number of functionally illiterate adults in the population, the decline of
achievement scores, the percentage of pupils failing basic skills tests, and the
like.

In the classroom the problem has been the identification of the children's initial
language range. This knowledge is essential in order to determine the basis from
which to commence instruction. Factors beyond the control of the school have
influenced initial language development, among which are:

1. The nature of the child's pre-school linguistic environment.

2. Parental personality traits and attitudes.

3. Degree of association with adults.

4. Child rearing practices in the home.

5. Number of siblings and ordinal rank among them.

6. Parental attitude toward their own speech community
and toward second language group(s).²

The complexity of personal and environmental inputs presents the challenge to
educators in rendering equal opportunity to pupils. "Merely providing the same
educational opportunity to all students does not satisfy the law when some students
are effectively foreclosed from any meaningful education by a language barrier."
Ascertaining the extent and degrees of linguistic diversities, and then accommodating
rich varieties of background into a series of organized learning activities, constitute
both the science and art of teaching.

Linguistic pluralism is common to the United States. Approximately 10 million
predominantly Spanish speaking individuals reside in this country. The city of
Seattle, as an example, is home to dozens of major dialects within its communities.

The humanist educator would assert that schooling should augment cognitive/affective 
growth regardless of language heritage. The non-native English speaking student 
offers the rich potential of first-hand sharing of culture and language. Even 
deficiencies of the native English speaker require special attention, so as to 
permit all children to be inundated with schooling's benefits.

Generically, the Mat-Sea-Cal Oral Proficiency Tests are a systematic, objective
vehicle for determining aural-oral competencies. They are a series of six comparable
grammatical structure tests (in Cantonese, English, Ilokano, Mandarin, Spanish,
and Tagalog).

Authored by Drs. Betty and Joe Matluck, with the support of the Center for Applied
Linguistics, these instruments were designed to

1. determine the child's ability to

a. understand and produce the distinctive
characteristics of the spoken language,

b. express known cognitive concepts in
the language, and

c. handle learning tasks in the language;

2. provide placement and instructional recommendations
with respect to alternate programs, such as English
instruction and bilingual education.

Subject matter encompasses eight concept groups: the skills of identifying,
classifying, quantifying, interrogating, and negating, and of showing relationships
such as spatial, case, and temporal. These concepts are assessed through
three types of communication manifestations, or modes: listening comprehension,
sentence repetition, and structured response. Eighty-one sentences comprise
each language test. Administration is through a standardized, taped stimulus and
a series of supporting visuals. Administration time averages 25-40 minutes, though
no time restriction is placed on respondees. Testing and scoring instructions are
fully described; samples of correct responses are provided with each item. In
section one the examinee selects one of three pictures. Sections two and three
require verbal response, and both sections are completely taped. Item-types are
multiple choice and short answer.

Initial field-testing was conducted in Seattle. Subsequent administrations have been
completed in school districts in the states of California, Idaho, Texas and Washington.

The Mat-Sea-Cal Tests are in field test form, and are clearly marked as such. Their
development has followed a standard psychometric process outlined for instruments
of the type. This paper focuses on statistical qualities of the English and Spanish
tests, as demonstrated through field-testing to date.

RELIABILITY

The first desirable characteristic of any instrument is the ability to demonstrate
consistency in measurement over a series of administrations or individuals. Known
as reliability, it depicts the degree of certainty with which one may base decisions
on gathered data. Without reasonably high reliability (.80 and up), the constructs
under investigation have only tentatively been measured by the given item sequence.

Mat-Sea-Cal test items are dichotomously scored, i.e., correct or incorrect.
Summations of item response scores (of which there are 108 on the English test, and
118 on the Spanish counterpart) are converted to a percentage scale, with normal score
distribution: 0 to 100. Thus, an appropriate method for calculating reliability is
that of internal equivalence, mathematically stated by the Kuder-Richardson formula.
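The internal equivalence calculation can be sketched in a few lines. The text does not say which Kuder-Richardson formula was applied, so the sketch below assumes formula 20 (KR-20) with the population variance of total scores; the function names and the tiny response matrix are hypothetical and serve only as illustration.

```python
def percent_score(row):
    """Convert one examinee's 0/1 item scores to the 0-100 percentage scale."""
    return 100.0 * sum(row) / len(row)


def kr20(responses):
    """Kuder-Richardson formula 20 (assumed here; the paper names no variant).

    responses: list of examinee rows, each a list of 0/1 item scores.
    """
    n = len(responses)          # number of examinees
    k = len(responses[0])       # number of items
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of the raw total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    # Sum of p*q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical data: four examinees, three dichotomously scored items.
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(sample)
```

Because the percentage scale is a linear rescaling of the raw total, the coefficient is the same whether it is computed on raw or percentage scores once the item variances are rescaled consistently; the sketch works on raw scores for simplicity.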

This method was selected over other common procedures (such as alternate form or
test-retest) for the following reasons. Only one form per Mat-Sea-Cal test exists; thus,
logistics precluded alternate form calculations. The instrument being a power test,
alternate form and internal equivalence coefficients would be nearly identical. Furthermore,
psychometric theory does not endorse creation of second forms when only one is
necessary for research or practical use.

The test-retest approach also appeared less desirable for reliability computations.
Retest coefficients require several months between administrations of the instrument.
In this instance reliability would be affected to an unknown extent by a combination
of schooling and maturational factors on oral proficiency development. By contrast,
a short retest interval employed with a one-form instrument introduces a "memory"
effect to examinees' performance on the readministration.

The overall English test reliability coefficient was computed to be .91, and that of
the Spanish test .94. Calculations on subsamples (divided by categories within
ethnicity, sex, geography, and educational attainment) ranged from .82 to .96.
Sample size for the overall coefficients was 3000.

Three specific conclusions may be drawn from these findings. First, the English
and Spanish versions of the test are sufficiently reliable to permit further development
and refinement on the present sample of items. In other words, the state of
measurement consistency is such that a complete rewriting of test items is
unnecessary. Second, sufficient confidence may be placed in data generated by the
Mat-Sea-Cal that other avenues of linguistic research may be supported by its
information base. Third, the Mat-Sea-Cal Tests appear to be relatively homogeneous
in nature. Reliability coefficients for whole tests in excess of .90 usually are
an indication of homogeneity.

ITEM DIFFICULTY INDEX
Item difficulty is a descriptive statistic measuring the ease (or difficulty) examinees
experience in correctly responding to the individual items. The acceptable level
for item difficulties hinges upon the basic purposes for testing, as specified in
the test's blueprint.

For the Mat-Sea-Cal, information on respondents was desired in an area approaching
"minimum oral proficiency" (operationally defined as 70 percent performance on the
instrument). Further, the tests were to be criterion referenced, used as a basis
for diagnosis and remediation. Thus, items with difficulty indices between 50
and 90 percent appeared to be most appropriate. This would permit respondents
to exhibit both the strengths (i.e., through the easier items) and weaknesses (i.e.,
through the more difficult questions) in their language patterns. By concentrating
the given number of items within the restricted range, a more reliable portrait of
aural-oral abilities is obtained.
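As an illustration of the screening rule just described, the following sketch flags items whose difficulty index falls outside the 50 to 90 percent band. The four percentages are taken from the Listening Comprehension rows of Table 1; treating the band as inclusive at both ends is an assumption, and the function name is hypothetical.

```python
def outside_target_range(p_values, low=50.0, high=90.0):
    """Return labels of items whose percent-correct lies outside [low, high].

    p_values: dict mapping an item label to its difficulty index (percent
    of examinees responding correctly). Inclusive bounds are an assumption.
    """
    return [item for item, p in p_values.items() if not (low <= p <= high)]


# Difficulty indices for four English Listening Comprehension items (Table 1).
english_lc = {"LC 1": 96, "LC 2": 46, "LC 4": 84, "LC 8": 74}
flagged = outside_target_range(english_lc)
```

Here items LC 1 (too easy) and LC 2 (too difficult) are flagged, which agrees with their appearance in the list of English items to be scrutinized.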

Tables one and two present, by communication mode, the difficulty indices for
items of the English and Spanish Tests, respectively.
(Note: the following symbols are used: LC - Listening Comprehension; SRep -
Sentence Repetition; and SR - Structured Response. Numerals followed by an "a"
or "b" indicate a response pair, one item measuring two language manifestations.)
Perusal of the tables would indicate the following items need to be scrutinized
critically:

English test:

Listening comprehension: 1, 2, 3, 5a, 5b, 6, 7, 12, 13a,
13b, 14, 18a thru 23b, 25a, 25b

Sentence repetition: 10a, 10b, 11, 14a, 14b, 17a, 17b, 19a,
20, 22a, 22b, 23a, 25

Structured response: 7, 10, 11, 13, 19

Spanish test:

Listening comprehension: 1, 2, 3, 6, 8, 12, 16

Sentence repetition: 4, 5a, 5b, 7, 14a, 16b, 21a, 22b, 25, 26a

Structured response: 10, 15, 21, 23, 26



Table #1
English Test

Item #     % responding correctly,          Point-biserial correlation,
           P(i) (over all grade levels)     R(i) (over all grade levels)

LC 1       96                               .29
   2       46                               .22
   3       98                               .12
   4       84                               .45
   5a      96                               .38
   5b      96                               .37
   6       99                               .27
   7       97                               .35
   8       74                               .28
   9a      87                               .40
   9b      87                               .41
   10      83                               .35
   11      89                               .43
   12      99                               .21
   13a     96                               .42
   13b     97                               .39
   14      95                               .25
   15      88                               .31



English Test:

Item #     % responding correctly,          Point-biserial correlation,
           P(i) (over all grade levels)     R(i) (over all grade levels)

SRep 16    89                               .61
   17a     95                               .56
   17b     97                               .49
   18      94                               .49
   19a     95                               .60
   19b     94                               .60
   20      95                               .51
   21      91                               .56
   22a     96                               .50
   22b     96                               .58
   23a     97                               .52
   23b     94                               .54
   24      75                               .47
   25      97                               .49
   26      87                               .58
SR 1       92                               .15
   2       98                               .50
   3       91                               .37
   4       92                               .40
   5       95                               .38
   6       90                               .43
   7       97                               .44
   8       94                               .45
   9       90                               .35
   10      45                               .25
   11      96                               .47
   12      78                               .49
   13      39                               .32
   14      61                               .43
   15      73                               .27
   16      62                               .38
   17      90                               .37
   18      55                               .32
   19      35                               .28
   20      83                               .33
   21      91                               .53
   22      82                               .42
   23      79                               .25
   24      59                               .26
   25      59                               .41
   26      61                               .41
   27a     91                               .24
   27b     91                               .24
   28a     88                               .30
   28b     88                               .29

Spanish Test:

Item #     % responding correctly,          Point-biserial correlation,
           P(i) (over all grade levels)     R(i) (over all grade levels)

SR 1       77                               .68
   2       82                               .74
   3       ?3                               .66
   4       61                               .52
   5       87                               .72
   6       68                               .60
   7       73                               .63
   8       81                               .69
   9       81                               .72
   10      33                               .30
   11      87                               .61
   12                                       .42
   13      59                               .46
   14      62                               .54
   15      38                               .35
   16      74                               .61
   17      79                               .64
   18      75                               .68
   19      79                               .56
   20      68                               .53
   21      39                               .38
   22      64                               .49
   23      43                               .48
   24      66                               .41
   25      56                               .45
   26      44                               .43
   27a     80                               .53
   27b     81                               .53
   28a     55                               .48
   28b     56                               .48

Approximately 40 percent of the English and 20 percent of the Spanish items require
careful investigation as a result of their difficulty indices. Such figures are not
abnormally high for instruments in the development stage, as is the Mat-Sea-Cal.
Further, one expects a fair proportion of items to be modified between field test
and commercial forms. However, before items are discarded or rewritten, other
statistics, especially the item discrimination, are examined.

ITEM DISCRIMINATION INDEX
Most instruments are designed to make distinctions between respondees, based
on some criterion. In statistical lexicon this is referred to as the item discrimination
index. It identifies non-discriminating questions on the basis of correlational
analysis between each item and a criterion score. The criterion measure most often
employed is the total score on the instrument itself.

The total percentage score on the respective English and Spanish Tests was employed
for analyzing the two language-item pools. As stated previously, all responses were
scored as either correct or incorrect. Thus, a point biserial coefficient was computed
as the discrimination index. Standards proposed by Guilford and Fruchter, and
Ebel were invoked for interpretation of the resulting item-total coefficients.
Specifically, items with correlations below .30 were recommended for revision, or
exclusion from the instrument. Items exhibiting correlations between .30 and .40
were subject to further investigation (in the form of factor analysis). Items with
indices above .40 were regarded as sufficient in discriminating power for retention
in the revised forms. (Discrimination indices are listed in Tables one and two.)
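The discrimination index and the three decision bands can be sketched as follows. The point-biserial coefficient is computed here as the Pearson correlation between a 0/1 item score and the total score, which is algebraically equivalent. The function names are hypothetical, and assigning a coefficient of exactly .40 to the "investigate" band is an assumption, since the text leaves that boundary open.

```python
import math


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a dichotomous (0/1) item and total scores."""
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    ss_x = sum((x - mean_x) ** 2 for x in item_scores)
    ss_y = sum((y - mean_y) ** 2 for y in total_scores)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("item or total scores have zero variance")
    return cov / math.sqrt(ss_x * ss_y)


def classify_discrimination(r):
    """Apply the bands described in the text to one item-total coefficient."""
    if r < 0.30:
        return "revise"        # below .30: revise or exclude
    if r <= 0.40:
        return "investigate"   # .30 to .40: further (factor) analysis
    return "retain"            # above .40: sufficient discriminating power


# Hypothetical data: one item answered by four examinees, with total scores.
r_example = point_biserial([1, 1, 0, 0], [4, 3, 2, 1])
```

Applied to Table 1, English item LC 3 (R = .12) would fall in the "revise" band and item LC 4 (R = .45) in the "retain" band.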

Compared to these standards, most items of the English and the Spanish Mat-Sea-Cal
Tests were psychometrically acceptable. Concern may be raised with the discrimination
power of the following items:

English test:

Listening comprehension: 1, 2, 3, 6, 8, 10, 12, 14, 15, 16,
18a thru 27b

Sentence repetition: 12

Structured response: 1, 9, 10, 13, 15, 18, 19, 23, 24, 27a
thru 28b

Spanish test:

Listening comprehension: 1, 2, 5a, 5b, 6, 9a, 9b, 10, 11a, 11b,
12, 14, 16, 27a, 27b

Sentence repetition: none

Structured response: 10, 15

From these data it appeared that a large pool of items on both English and Spanish
Mat-Sea-Cal Tests delineated between the orally proficient and those lacking in
structure/concept skills. Those items in the grey area of discrimination power,
i.e., with indices between .30 and .40, were subject to further analysis to
determine their congruence to test purposes.

FACTOR ANALYSIS

All items with discrimination indices of .30 and greater were included in the variable
pool for factor analysis. For both English and Spanish Tests (separately) an analysis
was conducted within each communication mode/item sequence (Listening Comprehension,
Sentence Repetition, and Structured Response).

Factoring was intended to explore mathematical relationships among the item-variables
that were not, at the time, known. Attention was focused upon latent phenomena
of the constructs under consideration, as exhibited by data generated from the item
sequences. The end product was descriptive typologies which reflected a substantive
sharing of common variation among groupings of item-variables.

In addition, the procedure was commenced for its data reduction potential. A large
series of variables can be "rearranged" or "reduced" to a smaller set of source
items which account for significant inter-relations in the data. This possibility
was also investigated, as a shorter Mat-Sea-Cal was desired in certain situations.

For calculations the principal factoring method was used. Diagonal elements of the
correlation matrix were initially replaced by the squared multiple correlation
coefficients. Eigenvalues, representing total variance accounted for by a factor,
were computed. The number of factors extracted for rotation generally corresponded
to Guttman's 1.0 criterion. Varimax rotation was employed, and subsequent
communality estimates were the respective eigenvalues for each extracted factor.
Iteration proceeded until convergence occurred, that is, the difference between
successive eigenvalues was .01 or less.

All factors had to contain at least three "pure" item variables (i.e., a variable
which loaded on one and only one factor). Factor loadings of .35 or greater were
considered significant (i.e., the minimum correlation for an item to load on a factor).
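The two retention rules can be illustrated with a short sketch that labels each item as "pure", "complex", or non-loading from a rotated loading matrix and then checks that every factor has at least three pure items. The loading values and function names below are hypothetical, and comparing the absolute value of a loading with .35 is an assumption, since the text does not say how negative loadings were handled.

```python
def classify_loadings(loadings, threshold=0.35):
    """Label each item by how many factors it loads on.

    loadings: dict mapping an item label to its list of rotated loadings,
    one per factor. Returns a dict of (label, factor_index or None).
    """
    result = {}
    for item, row in loadings.items():
        hits = [f for f, value in enumerate(row) if abs(value) >= threshold]
        if len(hits) == 1:
            result[item] = ("pure", hits[0])       # loads on exactly one factor
        elif len(hits) > 1:
            result[item] = ("complex", None)       # loads on several factors
        else:
            result[item] = ("non-loading", None)   # loads on no factor
    return result


def factors_meeting_minimum(classified, n_factors, minimum=3):
    """For each factor, report whether it has at least `minimum` pure items."""
    counts = [0] * n_factors
    for kind, factor in classified.values():
        if kind == "pure":
            counts[factor] += 1
    return [count >= minimum for count in counts]


# Hypothetical two-factor loading matrix for five items.
example = {
    "a": [0.60, 0.10],
    "b": [0.50, 0.05],
    "c": [0.40, 0.20],
    "d": [0.45, 0.50],
    "e": [0.10, 0.20],
}
labels = classify_loadings(example)
factor_ok = factors_meeting_minimum(labels, n_factors=2)
```

In this made-up example the first factor has three pure items and would be kept, while the second has none and would prompt a solution with fewer factors.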

In all, forty-five computerized factor-analytic runs were made using the Statistical
Package for the Social Sciences.¹² A summary of the findings is presented in
Table 3. In general, three types of items were discerned: those that should be
retained in their present form, those which are relatively easy though acceptable
in discrimination power, and those which require revision.

It should be noted that factor analyzing of "coupled items", those with an "a"
and a "b" part, proved difficult. Both parts typically correlated highly. As a result,
"a" and "b" item pairs had to be analyzed separately, on different factor runs.

Table 3

FACTOR PATTERNING

English Test

Listening Comprehension:

factor 1: P = (83 - 89)*, R = (35 - 44)**
factor 2: P = (96 - 97), R = (35 - 41)

conclusions: (on individual items)

1. retain in present form: #4, 9a, 9b, 10, 11, and 17
2. easy items (high Pi, acceptable Ri): #5a, 5b, 7, 13a, and 13b
3. revise: #1-3, 6, 8, 12, 14-16, and 18a-27b

Sentence Repetition:

factor 1: P = (43 - 96), R = (48 - 62), concept: "number"
factor 2: P = (55 - 96), R = (46 - 57)

conclusions: (on individual items)

1. retain in present form: #1-3, 5-9c, 13a, 13b, 15a-16, 18-21,
22b, 23b, 24, and 26
2. easy items (high Pi, acceptable Ri): #4, 10a-11, 14a, 14b,
17a, 17b, 22a, 23a, and 25
3. revise: #12

Structured Response:

factor 1: P = (55 - 78), R = (33 - 49), concept: "temporality"
factor 2: P = (90 - 98), R = (39 - 54), concept: "identification"
complex items: P = (93 - 90), R = (38 - 43)
non-loading items: P = (35 - 39), R = (30 - 33)

conclusions: (on individual items)

1. retain in present form: #3, 4, 6, 8, 9, 11, 12, 14, 16-18, 20-22, 25, and 26
2. easy items (high Pi, acceptable Ri): #2, 5, and 7
3. revise: #1, 10, 13, 15, 19, 23, 24, and 27a-28b

*P = (xx - yy): is the range of the difficulty index of variables included in this
factor.

**R = (xx - yy): is the range of the discrimination index (point-biserial coefficients)
of variables included in this factor.

Spanish Test

Listening Comprehension:

factor 1: P = (71 - 92), R = (33 - 61)
factor 2: P = (92 - 98), R = (32 - 43)

conclusions: (on individual items)

1. retain in present form: #4, 5b, 7, 13a, 13b, 15, 17, and 19a-26b
2. easy items (high Pi, acceptable Ri): #3, 6, 12, 18a, and 18b
3. revise: #1, 2, 5a, 8-11b, 14, 16, 27a, and 27b

Sentence Repetition:

factor 1: P = (73 - 86), R = (55 - 68)
factor 2: P = (24 - 59), R = (38 - 55)

conclusions: (classification of items by difficulty index)

1. high group: [P = (73 - 86), R = (55 - 68)]: #8a-9a, 11a-12, 18a,
19a, 19b, 23a, and 24
2. low group: [P = (24 - 59), R = (38 - 55)]: #1a, 1b, 3a, 10a, 13a,
13b, 15a, 15c, 17a, 17b, 18b, 21b, and 23b
3. complex group: [P = (57 - 81), R = (52 - 73)]: #2a, 4-5b, 7, 9b, 14a,
16a, 16b, 21, 22b, 25-26b
4. complex/high*: [P = (70 - 81), R = (52 - 73)]: #6, 10b, 14b, 15b, 20a,
and 20b
5. complex/low**: [P = (59 - 63), R = (52 - 61)]: #2b, 3b, and 22a

Structured Response:

factor 1: P = (53 - 87), R = (42 - 74), concepts: "number",
"identification", and "case relationship"
factor 2: P = (38 - 56), R = (35 - 49)

conclusions: (on individual items)

1. retain in present form: #1-9, 11-14, 16-20, 22, 24, and 26-27b
2. difficult items (low Pi, acceptable Ri): #15, 21, 23, 25, 28a, and 28b
3. revise: #10

*These variables are "complex", though on some runs load as "pure" on the high
factor.

**These variables are "complex", though on some runs load as "pure" on the low
factor.

On the English test the Listening Comprehension section needs the greatest amount
of work. The second half of the Listening mode (items 18a through 27b) requires
revision, as do six other items. Questions 5a, 5b, 7, 13a and 13b are classified
as relatively easy, though possessing reasonable discrimination indices.

The Sentence Repetition mode supported two to three factors. Items typically grouped
according to their difficulty index; thus the two factor solution appeared more
appropriate. The difficulty index of the easier item group was comparable to that
of the similar factor in the Listening section. Also, the communication concept "number"
completely loaded on the factor with lower difficulty indices.

In Structured Response nine items had low discrimination indices. These need
revision, and subsequently were not factor analyzed. The remaining items gravitated
into four factors. However, the third factor repeatedly contained only two pure
variables, and the fourth was completely composed of complex loadings. As a
result, a two factor solution was specified, and the analysis re-performed. The
findings were similar to the other two modes, items aggregating by difficulty index.
The factor composed of relatively easy items also contained most of the items in
the communication concept of identification. The "temporality" variables accrued,
en masse, to the other factor.

On the Spanish test in the Listening Comprehension mode, two factors emerged.
Again, labelling of factors went according to the difficulty index of the respective
items. Of note, also, the entire second half of this section is composed of item
pairs (#18a through 27b) and thus had to be analyzed separately. Furthermore,
investigating any sizable portion of either the "a" or "b" pair-set with the non-paired
items yielded a special two-factor solution. One factor contained only members
from the paired group, the other included all non-paired variables. Analysis of
"a" items (18a, 19a ... 27a), then "b" items (18b, 19b ... 27b), separately,
resulted in single factor solutions. These item groups are obviously highly correlated.
Thus, a few items from the "a" and the "b" groups may be omitted without detriment
to assessment purposes. This would result in a shorter Listening Comprehension
mode on the Spanish test.

The Sentence Repetition section supported three variable groups, but only two factors.
The factors were identifiable by difficulty index. Items with a low percentage of
correct responses formed one group. Items with distinctly higher difficulty
indices loaded on the second factor. The third group of variables had "complex"
loadings, that is, they aligned with both factors. Also, a few items vacillated
between variable groups on different analytic runs (and are identified in Table 3 as
"complex/high" and "complex/low").

In Structured Response a discernible pattern of difficulty levels emerged. The
two factor solution facilitated easy classification of all but three items. Further,
communication concept items of number, identification, and case relationship
loaded on the factor whose variables possessed higher difficulty indices.

SUMMARY AND RECOMMENDATIONS
The Mat-Sea-Cal Oral Proficiency Tests, in English and in Spanish (Field Test
Edition), have been proposed as a means of assessing children's linguistic
skills. The research reported herein examined certain statistical qualities of
these instruments.

Both English and Spanish Tests satisfied psychometric standards for reliability.
This permitted further meaningful investigation into other characteristics of the
instruments, as data derived from them were judged as consistent. Also, the
magnitude of the reliability coefficients suggested that these tests were relatively
homogeneous in content. The method of internal equivalence was employed for
reliability calculations.

In addition, both language tests contained a large number of items which possessed
a desirable difficulty index, specified as the 50 to 90 percent range. These were
intended as criterion referenced instruments designed to assess fluency near the
minimum oral proficiency level (defined as 70 percent performance). The Spanish
Test demonstrated a broad sampling of the target difficulty index. Most of its
items were deemed acceptable. The English Test proved more homogeneous, with
a large concentration in the higher percentages of the index. This was particularly
true in the Listening Comprehension mode. As a result, the expenditure of additional
effort will be required, particularly in this one section.

Point-biserial coefficients were computed for the item discrimination indices. By
and large most items met accepted psychometric criteria on discrimination power.
The Sentence Repetition mode appeared the strongest in this matter, the Listening
Comprehension the weakest. An absolute minimum of .30 was invoked for proceeding
with further analysis. Items in the .30 to .40 range were rendered extra attention,
as such figures suggest the need for additional refinement.

An in-depth factor analysis constituted the final phase of the pursuit. Items with
discrimination indices above .30 were included. The analysis for each test was
conducted within the three communication modes. For the analysis principal
factoring was applied. Squared multiple correlation coefficients were inserted as
initial estimates of communality, thereafter eigenvalues. Extracted factors were
required to have at least three "pure" loadings (of .35 or greater). Generally,
two or three item pools were discovered within each communication mode. The items
separated themselves most often according to difficulty index. Complete groups of
items representing certain communication concepts did, at times, load on one factor.

The findings suggest that a reduction in the number of items per test is possible.
Selection of items would follow the test design, that is, the performance region
in which assessment was desired.

Also, it was noted that combining two language manifestations into one question
failed to be a discriminating technique. Item pairs were so highly correlated that
the magnitude of their relationship outweighed either item's intercorrelation with
all other variables, combined. Thus, each item needs to be a separate entity in
future revisions of the instruments.

Next, a small series of relatively easy, but discriminating, items exists on each
test. This raises an interesting possibility. Such questions may be separated and
used as a mini pre-test for children suspected of having little oral fluency in the
given language. As the items possess discrimination power, they offer a reasonably
accurate and objective measure. As they are relatively easy, only students with
the largest of language deficiencies could be expected to do poorly on them. However,
for such students, an exhaustive, in-depth assessment is unnecessary; a brief
but accurate appraisal is what is required.

Finally, a thorough linguistic examination of the data is in order. The content of
questions missed frequently, and items rarely missed, begs scrutiny. Perhaps
certain parts of these tests are too easy or too difficult. Or, perhaps an order of
language skill acquisition exists, alone, or in combination with maturational and/or
environmental effects. Potential findings from such investigations might provide
new directions for classroom instruction in language development, an interesting
thought, indeed.